The Abject Failure of Keyword IR for Mathematics Search: Berkeley at NTCIR-10 Math

Authors

  • Ray R. Larson
  • Chloe Reynolds
  • Fredric C. Gey
Abstract

This paper demonstrates that classical content search using individual keywords is inadequate for mathematical formulae search. For the NTCIR-10 Math Pilot Task, the authors used standard content-word indexing for search, coupled with search for components of mathematical formulae. This was followed by formula extraction from the top-ranked documents. Performance was terrible, even for partial relevance. Adding some manual reformulation of topics into queries did not improve retrieval performance.
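To make the pipeline in the abstract concrete, here is a minimal sketch of keyword retrieval followed by formula extraction. It is not the authors' system: the toy corpus, the `$...$` formula delimiters, the tokenizer, the TF-IDF weighting, and the function names `tokenize`, `rank`, and `extract_formulae` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: prose mixed with $...$ formulae (an assumption).
DOCS = {
    "d1": "We bound the sum $\\sum_{i=1}^{n} x_i^2$ using the inequality $a^2 + b^2 \\ge 2ab$.",
    "d2": "The integral $\\int_0^1 x^2 dx$ equals one third.",
    "d3": "A short history of the word inequality in economics.",
}

FORMULA = re.compile(r"\$(.+?)\$")                   # one inline formula
TOKEN = re.compile(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]")   # words, numbers, symbols


def tokenize(text):
    """Flatten prose and formula source alike into lowercase tokens."""
    return [t.lower() for t in TOKEN.findall(text)]


def rank(query, docs):
    """Rank documents by a plain TF-IDF sum over the query's tokens."""
    n = len(docs)
    tfs = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    df = Counter(tok for tf in tfs.values() for tok in tf)
    query_tokens = set(tokenize(query))
    scores = {}
    for doc_id, tf in tfs.items():
        score = sum(
            tf[q] * math.log((n + 1) / (df[q] + 0.5))
            for q in query_tokens
            if q in tf
        )
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


def extract_formulae(doc_ids, docs, k=2):
    """Pull every formula out of the k top-ranked documents."""
    return [(d, f) for d in doc_ids[:k] for f in FORMULA.findall(docs[d])]


ranking = rank("inequality a^2 + b^2", DOCS)
for doc_id, formula in extract_formulae(ranking, DOCS, k=1):
    print(doc_id, formula)
```

Because the tokenizer treats a formula as a bag of symbols, a query for `x^2` matches any document containing `x`, `^`, and `2` in any arrangement. That loss of structure is the kind of weakness the abstract attributes to keyword search.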

Similar resources

Similarity Search for Mathematics: Masaryk University Team at the NTCIR-10 Math Task

This paper describes and summarizes experiences of Masaryk University team MIRMU with the mathematical search performed for the NTCIR pilot Math Task. Our approach is the similarity search based on enhanced full text search utilizing attested state-of-the-art techniques and implementations. The variability of used Math Indexer and Searcher (MIaS) system in terms of the math query notation was t...

NTCIR-11 Math-2 Task Overview

  • Mathematics plays a fundamental role in Science, Technology, and Engineering (learn from Math, apply for STEM)
  • Mathematical knowledge is rich in content, sophisticated in structure, and technical in presentation!
  • There are a lot of documents with maths
      – 120,000 journal articles per year in pure/applied math, 3.5 million overall
      – 50 million science articles in 2010 with a doubling time of 8-...

The MCAT Math Retrieval System for NTCIR-10 Math Track

NTCIR Math Track targets mathematical content access based on both natural language text and mathematical formulae. This research describes the participation of MCAT group in the NTCIR math retrieval subtask and math understanding subtask. We introduce our mathematical search system that is capable of formula search, and full-text search. We also introduce our mathematical description extractio...

Berkeley at NTCIR-2: Chinese, Japanese, and English IR experiments

This paper reports on the work of Berkeley group at the second NTCIR workshop on Japanese & English IR and Chinese IR. A number of runs were submitted on all subtasks in the two main tasks. Our main focus on the Japanese monolingual subtask was on comparing the retrieval effectiveness of different segmentation methods. The experimental results show the bigram indexing outperformed the word-base...

Math Indexer and Searcher under the Hood: Fine-tuning Query Expansion and Unification Strategies

This paper summarizes the experience of Math Information Retrieval team of Masaryk University (MIRMU) with the NTCIR-12 MathIR arXiv Main Task and its subtasks. We based our approach on the MIaS system. Based on NTCIR-11 Math-2 Task relevance judgements, we developed an evaluation platform. Using this platform we rigorously evaluated combinations of new features and picked the most promising on...

Publication date: 2013